RNA matrix models with external interactions and their asymptotic behaviour 
We study a matrix model of RNA in which an external perturbation acts on $n$ nucleotides of the polymer chain. The effect of the perturbation appears in the exponential generating function of the partition function as a factor $(1 - n\alpha/L)$ [where $\alpha$ is the ratio of strengths of the original to the perturbed term and $L$ is the length of the chain]. The asymptotic behaviour of the genus distribution functions for the extended matrix model is analyzed numerically when (i) $n = L$ and (ii) $n = 1$. In these matrix models of RNA, as $n\alpha/L$ is increased from 0 to 1, it is found that the universality of the number of diagrams $a_{L,g}$ at a fixed length $L$ and genus $g$ changes from $3^L$ to $(3 - n\alpha/L)^L$ ($2^L$ when $n\alpha/L = 1$), and the asymptotic expression of the total number of diagrams $\mathcal{N}$ at a fixed length $L$ but independent of genus $g$ changes in the factor $\exp(\sqrt{L})$ to $\exp[(1 - n\alpha/L)\sqrt{L}]$ ($\exp(0) = 1$ when $n\alpha/L = 1$).

PACS numbers: 02.10.Yn, 87.14.gn, 11.10.Jj, 87.15.-v
I. INTRODUCTION

Improved understanding of the process of folding of RNA finds its ultimate use in the prediction of the fully folded, partially folded and completely unfolded structures under physiological conditions. Under these conditions, unfolding is a very slow process as compared to folding in the presence of a force. Application of a force increases the unfolding rate and we can therefore get the unfolded structures from the folded ones ([1] and references therein). Experimental techniques of force-induced measurements have proved successful in probing properties related to different aspects of RNA folding and unfolding, and domain unfolding in proteins, polysaccharides and nucleic acids ([2] and references therein). Experiments have been performed on double-helical DNA to study its elastic and structural properties using electric fields and hydrodynamic flow, among other methods of force application ([3] and references therein). The advent of the atomic force microscopy (AFM) technique provided an important tool for the study of the basic underlying framework of molecular structural biology. Over the years, optical tweezers and AFM techniques have been employed to study the physical, elastic and structural properties of biomolecules by recording their force extension curves (FECs) and studying the force-dependent dynamics and folding landscapes of the molecules ([4, 5, 6, 7, 8, 9] and references therein). Conformations of biopolymers (DNA, RNA and proteins) that are otherwise not accessible by the conventional methods of measurement, NMR spectroscopy and X-ray crystallography, can be probed with AFMs. These conformations help in revealing the underlying mechanical framework of biological systems ([3] and references therein). Mechanical unfolding and refolding of single RNA molecules has been studied using force-ramp, hopping and force-jump methods ([10] and references therein). In mechanical unfolding experiments, it has been observed that at a critical value of the applied force, the hairpin structure toggles between the folded and the unfolded states [11, 12, 13]. In these experiments, ionic concentrations play an important role. Experiments of Bustamante et al. [12, 13] have shown that the denaturation of RNA by a constant force involves multiple trajectories (for RNA hairpins and the Tetrahymena thermophila ribozyme) during the transition from the folded state to the unfolded state. These trajectories depend on the point at which the force is applied [1, 13]. This diversity in the folding-unfolding pathways is due to the rugged energy landscape of RNA (consisting of many minima). Controlled force loading and unloading rates can be used to steer single RNA molecules into either their native or misfolded pathways. Different force unloading rates in experiments on TAR RNA molecules showed different types of trajectories associated with particular refolding characteristics ([15] and references therein).

We discuss here, very briefly, a generalization of the extended random matrix model of RNA folding proposed in [16], where the external perturbation acts on a single nucleotide ($n = 1$) and on $n$ nucleotides ($n < L$) in the polymer chain (we will refer to the two RNA models as the 1-NP RNA model, with NP being Nucleotide Perturbation, and the n-NP RNA model respectively). In [16], the external perturbation acted on all the nucleotides in the polymer chain (i.e., $n = L$, where $n$ is the number of bases on which the force is acting). We briefly outline the extended matrix model of [16] for completeness and follow it up with results and a comparative discussion for the 1-NP and n-NP models. Further, we present a detailed numerical analysis of the asymptotics of the extended matrix model of RNA with perturbation on all the nucleotides in the polymer chain. The genus distribution functions of the matrix model of RNA in [17, 18], namely the total number of diagrams at a fixed length $L$ but independent of genus $g$, $\mathcal{N}$, and the number of diagrams at a fixed length $L$ and genus $g$, $a_{L,g}$, are found to change in the presence of the external perturbation. We extend our numerical asymptotic analysis to the n-NP RNA model as well.

II. EXTENDED MATRIX MODELS OF RNA 

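We review here the effect of a perturbation acting on all the nucleotides in the polymer chain ($n = L$), studied in [16]. The nucleotide-nucleotide interaction partition function of the polymer chain with a perturbation on all the bases is

$$Z_{L,\alpha}(N) = \frac{1}{A_L(N)} \int \prod_{i=1}^{L} d\phi_i \, \exp\Big[-\frac{N}{2}\sum_{i,j=1}^{L}(V^{-1})_{ij}\,\mathrm{Tr}\,\phi_i\phi_j\Big] \exp\Big[-\frac{N}{L}\sum_{i=1}^{L}(W^{-1})_{i}\,\mathrm{Tr}\,\phi_i\Big] \frac{1}{N}\mathrm{Tr}\prod_{i=1}^{L}(1+\phi_i) \qquad (1)$$

where $A_L(N) = \int \prod_{i=1}^{L} d\phi_i \, \exp\big[-\frac{N}{2}\sum_{i,j=1}^{L}(V^{-1})_{ij}\,\mathrm{Tr}\,\phi_i\phi_j\big] \exp\big[-\frac{N}{L}\sum_{i=1}^{L}(W^{-1})_{i}\,\mathrm{Tr}\,\phi_i\big]$ is the normalization constant, $\exp\big[-\frac{N}{L}\sum_{i=1}^{L}(W^{-1})_{i}\,\mathrm{Tr}\,\phi_i\big]$ is the perturbation term, $V_{ij}$ is an $(L \times L)$ symmetric matrix containing information on the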
interactions between the $L$ nucleotides at positions $i$ and $j$ in the polymer chain, $\phi_i$ are $L$ independent $(N \times N)$ Hermitian matrices and the observable $\prod_i (1 + \phi_i)$ is an ordered product over the $\phi_i$'s. We consider $V_{ij} = v$ and $W_i = w$, where $v$ gives the strength of interaction between the nucleotides at positions $i$ and $j$ (in these models, the interaction between any two nucleotides of the chain is considered the same and equal to $v$) and $w$ gives the strength of the perturbation. Carrying out a series of Hubbard-Stratonovich transformations, the integral over $L$ matrices in eq. (1) reduces to an integral over a single $(N \times N)$ Hermitian matrix $\sigma$

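$$Z_{L,\alpha}(N) = \frac{1}{R_L(N)} \int d\sigma \, \exp\Big[-\frac{N}{2v}\mathrm{Tr}\,\sigma^2 - \frac{N}{w}\mathrm{Tr}\,\sigma\Big] \frac{1}{N}\mathrm{Tr}\,(1 + \sigma)^L \qquad (2)$$

where $R_L(N) = \int d\sigma \, \exp\big[-\frac{N}{2v}\mathrm{Tr}\,\sigma^2 - \frac{N}{w}\mathrm{Tr}\,\sigma\big]$. Following the algebra in [16] (from eq. 5 to eq. 15), the exponential generating function $G(t, N, \alpha)$ of the partition function $Z_{L,\alpha}(N)$ is

$$G(t, N, \alpha) = \sum_{L=0}^{\infty} Z_{L,\alpha}(N)\,\frac{t^L}{L!} = \exp\Big[\frac{v t^2}{2N} + t(1 - \alpha)\Big] \Big[\frac{1}{N}\sum_{k=0}^{N-1}\binom{N}{k+1}\frac{(v t^2)^k}{k!\,N^k}\Big] \qquad (3)$$

where $\alpha = v/w$ gives the ratio of strengths of the original to the perturbed term.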

For $\alpha = 0$, the extended matrix model of RNA folding reduces to the random matrix model in [17, 18]. However, for $\alpha = 1$ it is observed that the partition function for odd lengths of the polymer chain vanishes completely. In the extended matrix model, each unpaired base of the polymer chain in the contact diagrams is associated with a factor $(1 - \alpha)$, which becomes zero when $\alpha = 1$, thus removing structures with any unpaired bases. We can therefore divide the structures into two regimes: (i) $\alpha < 1$, comprising structures with both unpaired and paired bases, and (ii) $\alpha = 1$, comprising only completely paired structures (structures with any unpaired bases are suppressed) [16]. The genus distributions for the extended matrix model are therefore significantly different for different $\alpha$'s, especially for $\alpha = 1$ [where $Z_{L,\alpha}(N) = 0$ for odd lengths of the polymer chain], as compared with the model of [17, 18]. The addition of a perturbation has thus changed the genus distributions and the overall enumeration of the structures given by this model.
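The role of the factor $(1 - \alpha)$ per unpaired base can be illustrated with a minimal sketch. It assumes $v = 1$ and $N = 1$, so that every contact diagram of every genus is counted with the same weight and the total number of diagrams is a sum over all partial pairings of the $L$ bases. The function name `total_diagrams` is hypothetical and not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def total_diagrams(L, alpha, v=1):
    """Total weighted number of diagrams at length L for N = 1 (assumed setting).

    Each of the k arcs (paired bases) carries a factor v, and each of the
    L - 2k unpaired bases carries a factor (1 - alpha).
    """
    x = 1 - Fraction(alpha)          # weight of one unpaired base
    total = Fraction(0)
    for k in range(L // 2 + 1):
        # number of ways to choose k disjoint pairs among L bases
        pairings = factorial(L) // (factorial(L - 2 * k) * factorial(k) * 2 ** k)
        total += pairings * Fraction(v) ** k * x ** (L - 2 * k)
    return total


if __name__ == "__main__":
    for L in range(1, 9):
        print(L, total_diagrams(L, 0), total_diagrams(L, Fraction(3, 4)), total_diagrams(L, 1))
```

For $\alpha = 1$ only the term with no unpaired bases survives, so every odd length gives zero and every even length counts the complete pairings only, which is the behaviour described above.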

A. EXTENDED MATRIX MODEL OF RNA WITH PERTURBATION ON A SINGLE BASE (1-NP) AND n BASES (n-NP)

We now consider a generalization of the extended matrix model proposed in [16] by adding a perturbation to a single nucleotide in the polymer chain only, $n = 1$ (1-NP). The motivation comes from the force-induced experiments used to obtain important characteristics of the folding and unfolding of RNAs, discussed in the introduction [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. We keep all the assumptions the same as for the model in [16]. The interaction partition function $Z_{L,\alpha}(N)$ for 1-NP will be given by eq. (1) with the perturbation term now being $\exp\big[-\frac{N}{L}(W^{-1})_{i}\,\mathrm{Tr}\,\phi_i\big]$ for the single perturbed base $i$, and the normalization constant given by $A_L(N) = \int \prod_{j=1}^{L} d\phi_j \, \exp\big[-\frac{N}{2}\sum_{j,k=1}^{L}(V^{-1})_{jk}\,\mathrm{Tr}\,\phi_j\phi_k\big] \exp\big[-\frac{N}{L}(W^{-1})_{i}\,\mathrm{Tr}\,\phi_i\big]$. Carrying out a mathematical analysis similar to that employed in going from eq. (1) to eq. (3) above, we can write the exponential generating function of the partition function as in eq. (3), with the only difference being that $\alpha$ in eq. (3) gets replaced by $\alpha/L$ for the 1-NP. This implies that when $n/L = 0$, i.e., no perturbation is acting, we get the matrix model of [17, 18]. When $n/L = 1$, we get the extended matrix model with perturbation on all the bases [16]. The 1-NP partition functions $Z_{L,\alpha}(N)$ for different $L$ can be found exactly from the exponential generating function [eq. (3) with $\alpha$ replaced by $\alpha/L$] by equating the coefficients of powers of $t$ on both sides of the equation. In general, if the number of bases with the perturbation is $n$, then $\alpha$ is replaced by $n\alpha/L$. When $n = L$, we get the extended matrix model with perturbation on all the bases, discussed briefly here [eqs. (1)-(3)] and in detail in [16].

FIG. 1: (a) Plot of the asymptotic formula of $\mathcal{N}$ in [18] (red dotted curve) with the numerically calculated $\mathcal{N}_\alpha$ values for different lengths $L$ corresponding to $\alpha = 0.75$ (boxed curve). (a') The new asymptotic formula $\mathcal{N}'_\alpha$ (red dotted curve) for the extended matrix model of RNA is plotted with the numerical $\mathcal{N}_\alpha$ values for different lengths $L$ for $\alpha = 0.75$ (boxed curve).

The diagrammatic representation of the n-NP differs from the diagrammatics of the model with perturbation on all the nucleotides in the factor $(1 - n\alpha/L)$ associated with each unpaired base, which replaces the factor $(1 - \alpha)$ in the contact diagrams of figure 1 in [16].
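The substitution of $\alpha$ by $n\alpha/L$ can be written as a small helper, shown here as a sketch only. The function names `effective_alpha` and `unpaired_weight` are hypothetical; the only input from the text is the rule that a perturbation on $n$ of the $L$ bases replaces $\alpha$ by $n\alpha/L$.

```python
from fractions import Fraction


def effective_alpha(alpha, n, L):
    """Effective ratio n*alpha/L when the perturbation acts on n of the L bases."""
    if not 0 <= n <= L:
        raise ValueError("n must satisfy 0 <= n <= L")
    return Fraction(alpha) * n / L


def unpaired_weight(alpha, n, L):
    """Factor attached to each unpaired base in the n-NP contact diagrams."""
    return 1 - effective_alpha(alpha, n, L)


if __name__ == "__main__":
    L = 40
    for n in (0, 1, 20, 40):
        print(n, effective_alpha(Fraction(3, 4), n, L), unpaired_weight(Fraction(3, 4), n, L))
```

With $n = L$ the helper returns $\alpha$ itself (the model with perturbation on all the bases), and with $n = 0$ it returns zero (no perturbation).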



III. ASYMPTOTICS OF THE EXTENDED MATRIX MODELS FROM NUMERICS 



The asymptotic behaviour of the genus distribution functions for the matrix model of RNA studied in [18] showed universal characteristics. We investigate here numerically the changes that the genus distribution functions, (i) the total number of diagrams at a fixed length $L$ but independent of genus $g$, $\mathcal{N}$ [defined as $\mathcal{N} = Z_L(N = 1)$], and (ii) the number of diagrams at a fixed length $L$ and genus $g$, $a_{L,g}$ [defined through $Z_L(N) = \sum_{g=0}^{\infty} a_{L,g} N^{-2g}$], of the model in [18] undergo when a perturbation is added to these models. The asymptotics of the genus distribution functions are computed for the extended matrix model (i) with perturbation on all the bases, $n = L$ [16], and (ii) with perturbation on $n$ bases (n-NP). We will represent the genus distribution functions for the different matrix models as follows: (i) $\mathcal{N}$ and $a_{L,g}$ will represent the asymptotic formulae for the model in [18], (ii) $\mathcal{N}'_\alpha$ and $a'_{L,g,\alpha}$ will represent the new asymptotic formulae for the extended matrix model of RNA [16] and (iii) $\mathcal{N}_\alpha$ and $a_{L,g,\alpha}$ will represent the numerical values of the genus distribution functions for different $\alpha$'s. We start with the exact asymptotic expressions (i) $\mathcal{N} \sim L^{L/2}\exp[-L/2 + \sqrt{L} - 1/4]/\sqrt{2}$ and (ii) $a_{L,g} \sim k_g 3^L L^{3g - 3/2}$ from [18] and compare the behaviour of $\mathcal{N}_\alpha$ and $a_{L,g,\alpha}$ for the extended matrix model for lengths up to $L = 40$ for different values of $\alpha$ (= 0, 0.25, 0.5, 0.75, 1). We begin by studying $\mathcal{N}_\alpha$ and $a_{L,g,\alpha}$ for the extended models with perturbation on all the bases ($n = L$).
FIG. 2: $(\mathrm{Log}\,\mathcal{N}_\alpha - \frac{1}{2}L\,\mathrm{Log}\,L + \frac{L}{2})$ versus $\sqrt{L}$ plots for different values of $\alpha$ along with their linearly fitted slopes: (a) $\alpha = 0$ (slope = 0.9818), (b) $\alpha = 0.25$ (slope = 0.7359), (c) $\alpha = 0.5$ (slope = 0.4926), (d) $\alpha = 0.75$ (linear fit to the two curves gives slope = 0.3595) and (e) $\alpha = 1$ (the plot is not linear).

TABLE I: Slopes of the linearly fitted plots for different values of $\alpha$ before and after the multiplication of $(1 - \alpha)$ with the $\sqrt{L}$ term: (i) $(\mathrm{Log}\,\mathcal{N}_\alpha + \frac{L}{2} - \sqrt{L})$ versus $L\,\mathrm{Log}\,L$ (Slope 1), (ii) $[\mathrm{Log}\,\mathcal{N}_\alpha + \frac{L}{2} - (1 - \alpha)\sqrt{L}]$ versus $L\,\mathrm{Log}\,L$ [Slope 1(a)], (iii) $(\mathrm{Log}\,\mathcal{N}_\alpha - \sqrt{L} - \frac{1}{2}L\,\mathrm{Log}\,L)$ versus $L$ (Slope 2) and (iv) $[\mathrm{Log}\,\mathcal{N}_\alpha - (1 - \alpha)\sqrt{L} - \frac{1}{2}L\,\mathrm{Log}\,L]$ versus $L$ [Slope 2(a)].

α       Slope 1   Slope 1(a)   Slope 2    Slope 2(a)
0       0.499     0.499        -0.5026    -0.5026
0.25    0.4885    0.499        -0.5353    -0.5022
0.5     0.4767    0.4987       -0.5683    -0.5027
0.75    0.4624    0.4981       -0.6025    -0.5060
1       0.4556    0.5003       -0.6331    -0.4992




1. Asymptotics for $\mathcal{N}_\alpha$

Figure 1(a) shows the combined plot of the asymptotic expression of $\mathcal{N}$ (red dotted curve) with the numerically computed $\mathcal{N}_\alpha$ values for $\alpha = 0.75$ (boxed curve). We have shown here, for illustration, the plot for only $\alpha = 0.75$. It is observed that as $\alpha$ is increased from 0 to 1, the boxed curves (for different $\alpha$'s) shift downward continuously, indicating an $\alpha$ dependence in $\mathcal{N}_\alpha$ for the extended matrix model of RNA. We investigate this dependence in the following numerical analysis.

Taking the Log of $\mathcal{N}$ we get $\mathrm{Log}\,\mathcal{N} \approx \frac{L}{2}\mathrm{Log}\,L - \frac{L}{2} + \sqrt{L} - \frac{1}{4} - \mathrm{Log}\sqrt{2}$. We are interested in the large length $L$ behaviour, and we see that the dependence of $\mathrm{Log}\,\mathcal{N}$ on $L$ is strongest in $L\,\mathrm{Log}\,L$. We linearly fit the plots of (i) $(\mathrm{Log}\,\mathcal{N}_\alpha - \sqrt{L} + \frac{L}{2})$ versus $L\,\mathrm{Log}\,L$ (Slope 1, table I), (ii) $(\mathrm{Log}\,\mathcal{N}_\alpha - \sqrt{L} - \frac{1}{2}L\,\mathrm{Log}\,L)$ versus $L$ (Slope 2, table I) and (iii) $(\mathrm{Log}\,\mathcal{N}_\alpha + \frac{L}{2} - \frac{1}{2}L\,\mathrm{Log}\,L)$ versus $\sqrt{L}$ (fig. 2) for different $\alpha$ and find their slopes. We find that there is a continuous decrease in the slopes as $\alpha$ goes from 0 to 1 in the linearly fitted plots of (i) and (ii) (Slope 1 and Slope 2 respectively of table I), strongly suggesting a dependence of $\mathcal{N}_\alpha$ on $\alpha$. In the fitted plots of (iii) we observe a remarkable behaviour for the $\alpha = 0.75$ and $\alpha = 1$ plots [fig. 2(d) and fig. 2(e)]. In the $\alpha = 0.75$ plot [fig. 2(d)], the points for odd and even lengths separate out into two very distinct curves, and for the $\alpha = 1$ plot [fig. 2(e)], the odd lengths vanish completely, leaving only the even length points in the figure. This indicates that $(\mathrm{Log}\,\mathcal{N}_\alpha + \frac{L}{2} - \frac{1}{2}L\,\mathrm{Log}\,L)$ versus $\sqrt{L}$ is very sensitive to changes in $\alpha$.
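A fit of the kind described for Slope 1 can be sketched as follows. The sketch assumes $v = 1$ and $N = 1$ (so the total number of diagrams is a weighted sum over partial pairings), uses natural logarithms, and fits over lengths 2 to 40; the fit range and the plain least-squares routine are assumptions, so the slopes it returns need not coincide digit for digit with table I. All function names are hypothetical.

```python
from math import factorial, log, sqrt


def total_diagrams(L, alpha):
    """Weighted number of diagrams at N = 1, v = 1: (1 - alpha) per unpaired base."""
    x = 1.0 - alpha
    return sum(
        factorial(L) // (factorial(L - 2 * k) * factorial(k) * 2 ** k) * x ** (L - 2 * k)
        for k in range(L // 2 + 1)
    )


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def slope_1(alpha, L_max=40):
    """Slope of (Log N_alpha - sqrt(L) + L/2) against L Log L."""
    # lengths with a vanishing count (odd L at alpha = 1) are skipped
    lengths = [L for L in range(2, L_max + 1) if total_diagrams(L, alpha) > 0]
    xs = [L * log(L) for L in lengths]
    ys = [log(total_diagrams(L, alpha)) - sqrt(L) + L / 2 for L in lengths]
    return fit_slope(xs, ys)


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(alpha, round(slope_1(alpha), 4))
```

The same two helpers can be reused for the other fits by changing the quantity placed on each axis.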

We try a factor of $(1 - \alpha)$ with the $\sqrt{L}$ term in the exponent of the $\mathcal{N}$ expression and then fit the plots of (i) $[\mathrm{Log}\,\mathcal{N}_\alpha + \frac{L}{2} - (1 - \alpha)\sqrt{L}]$ versus $L\,\mathrm{Log}\,L$ [Slope 1(a), table I] and (ii) $[\mathrm{Log}\,\mathcal{N}_\alpha - (1 - \alpha)\sqrt{L} - \frac{1}{2}L\,\mathrm{Log}\,L]$ versus $L$ [Slope 2(a), table I] for different values of $\alpha$. We observe that now all the slopes are nearly the same and equal to $+\frac{1}{2}$ and $-\frac{1}{2}$ for (i) and (ii) respectively. This supports the factor of $(1 - \alpha)$ with the $\sqrt{L}$ term in the exponent of $\mathcal{N}$ as the correct choice. We can therefore write the new asymptotic expression of the total number of diagrams at a fixed length $L$ and $\alpha$ but independent of genus $g$, $\mathcal{N}'_\alpha$, for the extended matrix model as

$$\mathcal{N}'_\alpha = L^{L/2}\exp\Big[-\frac{L}{2} + (1 - \alpha)\sqrt{L} - \frac{1}{4}\Big]\Big/\sqrt{2}. \qquad (4)$$

FIG. 3: The asymptotic formula for $a_{L,g}$ in [18] (black dotted curve) is plotted with the numerical $a_{L,g,\alpha}$ values (green boxed curve) for different lengths $L$ for $\alpha = 0.75$. Note: The figure plots the $a_{L,g,\alpha}$'s for all genera corresponding to a particular length $L$ of the polymer chain. The lowest curve (black dotted or green boxed) corresponds to genus $g = 0$ for all the lengths (0 to 40) and the successive curves in the upward direction correspond to the next higher genera, with the maximum genus given by $g_{max} \sim L/4$.

We see from eq. (4) that the total number of structures for the extended matrix model changes considerably; for example, when $\alpha = 1$ the $\sqrt{L}$ term vanishes from the exponent. We repeat the exercise as before and plot the new asymptotic formula $\mathcal{N}'_\alpha$ for the extended matrix model of RNA given by eq. (4) [fig. 1(a'), red dotted curve] with the numerically obtained $\mathcal{N}_\alpha$ values for different $\alpha$'s (represented by the boxed curve, shown here for only $\alpha = 0.75$). The plot for the new asymptotic formula coincides with the numerical data $\mathcal{N}_\alpha$, confirming the new formula.
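The comparison between the asymptotic formula with the $(1 - \alpha)\sqrt{L}$ term and the numerical totals can be sketched as below. The sketch assumes $v = 1$ and $N = 1$, uses natural logarithms, and works with logarithms throughout because the totals grow very fast with $L$. The function names are hypothetical, and the printed numbers are meant only to show how such a comparison might be set up.

```python
from math import factorial, log, sqrt


def log_exact_total(L, alpha):
    """Log of the weighted number of diagrams at N = 1, v = 1.

    Raises ValueError when the total vanishes (odd L at alpha = 1).
    """
    x = 1.0 - alpha
    total = sum(
        factorial(L) // (factorial(L - 2 * k) * factorial(k) * 2 ** k) * x ** (L - 2 * k)
        for k in range(L // 2 + 1)
    )
    return log(total)


def log_asymptotic_total(L, alpha):
    """Log of L^(L/2) exp[-L/2 + (1 - alpha) sqrt(L) - 1/4] / sqrt(2)."""
    return 0.5 * L * log(L) - 0.5 * L + (1.0 - alpha) * sqrt(L) - 0.25 - 0.5 * log(2.0)


if __name__ == "__main__":
    for L in (10, 20, 40):
        exact = log_exact_total(L, 0.75)
        new = log_asymptotic_total(L, 0.75)    # formula with the (1 - alpha) factor
        old = log_asymptotic_total(L, 0.0)     # unperturbed formula
        print(L, round(exact, 3), round(new, 3), round(old, 3))
```

On a logarithmic scale the formula with the $(1 - \alpha)$ factor tracks the numerical values far more closely than the unperturbed formula does.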

2. Asymptotics for $a_{L,g,\alpha}$

The plot (fig. 3) of the asymptotic formula for $a_{L,g}$ (black dotted curve) with the numerically calculated $a_{L,g,\alpha}$ values (green boxed curve) for different $\alpha$'s (shown for $\alpha = 0.75$) clearly indicates that the asymptotic formula of the model in [18] needs to be changed to give the asymptotic behaviour of the extended matrix model of RNA folding.
TABLE II: Slopes for different values of $\alpha$ obtained from the linear fits to the plots between $L$ and $\mathrm{Log}\,a_{L,g=1,\alpha}$ (Slope 1), the $x(\alpha)$ values for each $\alpha$, and slopes from the linear fit of plots between $\mathrm{Log}\,L$ and $[\mathrm{Log}\,a_{L,g=1,\alpha} - L\,\mathrm{Log}(3 - \alpha)]$ for each $\alpha$ (Slope 2).

α            Slope 1   x(α)     Slope 2
0            1.198     3.313    1.646
0.25         1.109     3.03     1.639
0.5          1.012     2.75     1.633
0.75         0.9065    2.476    1.623
1            0.7891    2.2      1.655
Analytical   1.24      3.4556   1.495



The curves for different $\alpha$'s (shown here for only $\alpha = 0.75$, fig. 3) move further and further away from the asymptotic expression curve as $\alpha$ goes from 0 to 1. This behaviour is studied and the correct asymptotic expression $a'_{L,g,\alpha}$ for the extended matrix model is found.

We start with the asymptotic expression $a_{L,g} = k_g 3^L L^{3g - 3/2}$, where $k_g = 1/(3^{4g - 3/2}\,2^{2g+1}\,g!\,\sqrt{\pi})$. Taking Log on both sides and fixing $g = 1$ (for simplicity) we get $\mathrm{Log}(a_{L,g=1}) \approx \mathrm{Log}\frac{1}{3^{5/2}\,2^3\sqrt{\pi}} + L\,\mathrm{Log}\,3 + \frac{3}{2}\mathrm{Log}\,L$. In $\mathrm{Log}(a_{L,g=1})$, the $L$ dependence is present in the form of $L$ and $\mathrm{Log}\,L$. We are interested in the large $L$ behaviour, so we first look for the dominant $L$ dependence. The linear fits to the plots of $\mathrm{Log}(a_{L,g=1,\alpha})$ versus $L$ in table II (Slope 1) show that the slopes of the numerical $a_{L,g,\alpha}$ curves for different $\alpha$'s are not the same and not equal to the slope of the $a_{L,g}$ asymptotic curve (the slope should be $\mathrm{Log}\,3$ for a plot between $\mathrm{Log}(a_{L,g=1,\alpha})$ and $L$ according to [18]). This indicates an $\alpha$ dependence in the factor 3 of the universal $3^L$ part of $a_{L,g}$, which we represent by $x(\alpha)$ in table II [where Slope 1 $= \mathrm{Log}\,x(\alpha)$, i.e. $x(\alpha) = \exp(\text{Slope 1})$]. We write the asymptotic formula by replacing 3 with $x(\alpha)$. The expression for $a'_{L,g,\alpha}$ after taking Log on both sides becomes $\mathrm{Log}\,a'_{L,g,\alpha} \approx \mathrm{Log}\,k_g + L\,\mathrm{Log}[x(\alpha)] + [3g - \frac{3}{2}]\,\mathrm{Log}\,L$. To determine the form of $x(\alpha)$, we plot $x(\alpha)$ versus $\alpha$, which is a straight line with slope = -1.133 and intercept = 3.466. In the same way as the asymptotic expression for $a_{L,g}$ in [18] had the universal term $3^L$, we find $x(\alpha)^L$ to be $x(\alpha)^L = (-\alpha + 3)^L$ for all $\alpha$. We therefore have $\mathrm{Log}\,a'_{L,g,\alpha} \approx \mathrm{Log}\,k_g + L\,\mathrm{Log}(3 - \alpha) + [3g - \frac{3}{2}]\,\mathrm{Log}\,L$. The universal $3^L$ part in the $a_{L,g}$ of [18] has been modified to $(3 - \alpha)^L$ for the extended matrix model [16]. The asymptotic formula gets modified to $a'_{L,g,\alpha} \sim k_g (3 - \alpha)^L L^{(3g - 3/2)}$.

Analyzing the $\mathrm{Log}(L)$ dependence now, we assume that there exists an $\alpha$ dependence in the exponent of $L$, which we represent by $f(\alpha)$. We can therefore write, from the modified equation after taking Log on both sides and substituting $g = 1$, $\mathrm{Log}(a'_{L,g=1,\alpha}) \approx \mathrm{Log}\frac{1}{3^{5/2}\,2^3\sqrt{\pi}} + L\,\mathrm{Log}(3 - \alpha) + \frac{3}{2}[f(\alpha)]\,\mathrm{Log}\,L$. Linearly fitted plots of $[\mathrm{Log}(a_{L,g=1,\alpha}) - L\,\mathrm{Log}(3 - \alpha)]$ versus $\mathrm{Log}(L)$ for different $\alpha$ values are shown in fig. 4. The figure shows a continuous separation of data points belonging to the even and odd lengths as $\alpha$ is increased from 0 to 1. There are two distinct lines at small lengths $L$ which merge into a single line at higher lengths $L$. For $\alpha = 1$ the points for odd lengths vanish completely from the plot. The slopes [table II, Slope 2] show that the difference between analytical and numerical values for different $\alpha$ is 0.01, which is small. The $\mathrm{Log}(L)$ term therefore shows no significant $\alpha$ dependence. So we fix $f(\alpha) = 1$. This gives the asymptotic formula of the number of diagrams at a fixed length $L$, genus $g$ and $\alpha$, $a'_{L,g,\alpha}$, for the extended matrix model of RNA as

$$a'_{L,g,\alpha} \sim k_g (3 - \alpha)^L L^{(3g - 3/2)}. \qquad (5)$$

The asymptotic formula [eq. (5)] thus obtained is plotted with the numerically found $a_{L,g,\alpha}$ values for different $\alpha$'s [fig. 5, shown here for only $\alpha = 0.75$] and it is seen that the formula matches the numerical results for large $L$. To verify the final form of the formula, we substitute different $\alpha$'s and $g = 1$ in eq. (5) and plot $[\mathrm{Log}\,a'_{L,g=1,\alpha} - L\,\mathrm{Log}(3 - \alpha)]$ versus $\mathrm{Log}\,L$. The slopes are found to be 1.495 in all the cases. This result will hold for any genus $g$, though we have shown here the result for only $g = 1$. It is interesting to note here that the universality of $a'_{L,g,\alpha}$ for the extended matrix model changes from $3^L$ in [18] to $2^L$ when $\alpha = 1$ (the completely paired base region).
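The change of the universal factor from 3 to $(3 - \alpha)$ can be checked independently in the simplest sector, genus $g = 0$, where the diagrams are the non-crossing partial pairings of the chain. The sketch below is not the genus-one analysis of the text: it assumes $v = 1$, counts only planar diagrams through a recurrence on the first base, and gives each unpaired base the weight $(1 - \alpha)$. The function name `planar_counts` is hypothetical.

```python
from fractions import Fraction


def planar_counts(L_max, alpha):
    """Weighted number of planar (genus 0) diagrams for lengths 0..L_max.

    The first base is either unpaired (weight 1 - alpha) or paired with a
    later base, which splits the chain into an enclosed and an outer part.
    """
    x = 1 - Fraction(alpha)
    counts = [Fraction(1)]                      # empty chain
    for L in range(1, L_max + 1):
        value = x * counts[L - 1]               # first base unpaired
        for k in range(L - 1):                  # k bases enclosed by the arc
            value += counts[k] * counts[L - 2 - k]
        counts.append(value)
    return counts


if __name__ == "__main__":
    for alpha in (Fraction(0), Fraction(1, 2), Fraction(1)):
        c = planar_counts(200, alpha)
        # growth per base estimated over two steps, so that alpha = 1
        # (where odd lengths vanish) is handled as well
        growth = float(c[200] / c[198]) ** 0.5
        print(alpha, round(growth, 3), float(3 - alpha))
```

At finite length the estimated growth per base lies slightly below $3 - \alpha$ because of the power-law factor in $L$, and it approaches $3 - \alpha$ as the length increases.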

The asymptotic behaviour of $a_{L,g,\alpha,n}$ and $\mathcal{N}_{\alpha,n}$ for the model with perturbation on $n$ bases is the same as for the model with perturbation on all the bases, except that $\alpha$ is replaced by $n\alpha/L$ [as is evident from the expression of the exponential generating function $G(t, N, \alpha)$ given by eq. (3) with $n\alpha/L$ in place of $\alpha$]. Thus we can write the asymptotic expressions of the genus distribution functions for a perturbation acting on $n$ bases as
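
$$a'_{L,g,\alpha,n} \sim k_g \Big(3 - \frac{n\alpha}{L}\Big)^L L^{(3g - 3/2)} \qquad (6)$$

and,

$$\mathcal{N}'_{\alpha,n} = L^{L/2}\exp\Big[-\frac{L}{2} + \Big(1 - \frac{n\alpha}{L}\Big)\sqrt{L} - \frac{1}{4}\Big]\Big/\sqrt{2}. \qquad (7)$$

FIG. 4: $[\mathrm{Log}\,a_{L,g=1,\alpha} - L\,\mathrm{Log}(3 - \alpha)]$ versus $\mathrm{Log}\,L$ plots for different values of $\alpha$: (a) $\alpha = 0$, (b) $\alpha = 0.25$, (c) $\alpha = 0.5$, (d) $\alpha = 0.75$ and (e) $\alpha = 1$. The slopes for these values of $\alpha$ are listed in Table II (Slope 2).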

The asymptotics for the extended matrix models therefore show marked changes in the presence of the perturbation in the universal term of $a_{L,g}$ and in the total number of structures $\mathcal{N}$ of the model in [18].



IV. CONCLUSIONS 

In this work, building on the extended matrix model of RNA folding proposed in [16], we study the effect of an external perturbation on only one nucleotide in the polymer chain of length $L$. We argue that $\alpha$ in the exponential generating function of the partition function of the model in [16] will be replaced by $\alpha/L$ if the perturbation acts on only one nucleotide in the chain. Further, we generalize this result to a finite number $n < L$ of perturbations on the nucleotides of the chain, where $\alpha$ in the exponential generating function of the partition function gets replaced by $n\alpha/L$ in eq. (3).

Next, we find numerically the asymptotic behaviour of the genus distribution functions for the extended matrix model of RNA folding in [16] and the n-NP model. We find from the numerical analysis that the universality of $a_{L,g}$, $3^L$ found in [18], changes to $(3 - \alpha)^L$ when the perturbation acts on all the bases in the polymer chain [which becomes $(3 - n\alpha/L)^L$ when the perturbation is on $n$ bases]. The power law term $L^{3g - 3/2}$ of $a_{L,g}$ [18] remains the same for the asymptotic formula of $a'_{L,g,\alpha}$ in the extended matrix models with perturbation on all the bases [16] and on $n$ bases. The total number of diagrams $\mathcal{N}$ also changes from its form in [18] to $\mathcal{N}'_\alpha = L^{L/2}\exp[-\frac{L}{2} + (1 - \alpha)\sqrt{L} - \frac{1}{4}]/\sqrt{2}$, with the term $\exp(\sqrt{L})$ in [18] changing to $\exp[(1 - \alpha)\sqrt{L}]$ for the matrix model with perturbation on all the bases [which becomes $\exp[(1 - n\alpha/L)\sqrt{L}]$ when the perturbation is on $n$ bases]. The most striking change is found in the universality of $a'_{L,g,\alpha}$ when $\alpha$ takes the value 1 (and $n = L$), as the universality goes from $3^L$ to $2^L$, and in the $(1 - \alpha)\sqrt{L}$ term in the exponent of $\mathcal{N}'_\alpha$, which goes to zero when $\alpha = 1$ and $n = L$. It is shown in fig. 2 and fig. 4 that as $\alpha$ is increased from 0 to 1 in steps of 0.25, the points corresponding to even and odd lengths of the chain start splitting up into two different curves at small lengths, but converge into a single linear curve as the length is increased. Note that at small lengths, this difference is most pronounced for $\alpha = 0.75$, for both $\mathcal{N}_\alpha$ and $a_{L,g,\alpha}$. The $\alpha = 1$ plots of $\mathcal{N}_\alpha$ and $a_{L,g,\alpha}$ [fig. 2(e) and fig. 4(e) respectively] show the absence of odd length data points. It is interesting to note that the genus distributions show different behaviour at small and large lengths (the analysis has been done for lengths up to $L = 40$). The large $L$ (asymptotic) behaviour of the distribution functions [eqs. (4), (5), (6) and (7)] found for the RNA matrix model with external perturbation shows prominent changes.

FIG. 5: The plot of the new asymptotic formula for $a'_{L,g,\alpha}$ (black dotted curve) for the extended matrix model of RNA is shown with the numerically obtained $a_{L,g,\alpha}$ for $\alpha = 0.75$ (green boxed curve).

We have studied the effect of an external perturbation on the RNA matrix model. In order to compare the results of the matrix model of RNA folding with external perturbations (discussed here and in [16]) with experiments (where the perturbations may be due to the constant forces discussed in the introduction or due to natural processes like transcription and translation taking place inside a living cell), a more detailed study will be undertaken.